import pandas as pd
import matplotlib.pyplot as plt
# Load your DataFrame from the CSV file
df_csv_wide = pd.read_csv('all_ticks_wide.csv')
df = pd.DataFrame(df_csv_wide)
# Convert the 'timestamp' column to datetime
df['timestamp'] = pd.to_datetime(df['timestamp'])
# Sort the DataFrame by the 'timestamp' column
df = df.sort_values('timestamp')
# Drop rows with missing values
df = df.dropna()
# Define the list of stock symbols
stocks = ['AKBNK', 'CCOLA', 'MGROS', 'OTKAR', 'PGSUS', 'TCELL']
# Filter the DataFrame to a 2-year time span for the selected stocks
start_timestamp = df['timestamp'].min()
end_timestamp = start_timestamp + pd.DateOffset(years=2)
df_2_years_span = df[(df['timestamp'] >= start_timestamp) & (df['timestamp'] < end_timestamp)]
# Set the 'timestamp' column as the index
df_2_years_span = df_2_years_span.set_index('timestamp')
# Keep only the selected stocks; copy so later column assignments do not act on a view of df
df_2_years_span = df_2_years_span[stocks].copy()
print(df_2_years_span)
                            AKBNK    CCOLA  MGROS     OTKAR  PGSUS    TCELL
timestamp
2016-03-28 06:30:00+00:00  6.2904  34.3489  17.70   96.5881  16.74   9.8841
2016-03-28 14:15:00+00:00  6.2510  34.2188  17.55   95.7742  16.73   9.8089
2016-03-28 14:30:00+00:00  6.2510  34.1818  17.54   95.7742  16.73   9.7755
2016-03-29 06:30:00+00:00  6.2904  34.1818  17.60   95.9551  16.91   9.8589
2016-03-29 14:15:00+00:00  6.2432  34.6833  17.27   95.5028  17.01   9.5914
...                           ...      ...    ...       ...    ...      ...
2018-03-27 06:45:00+00:00  8.1184  34.2218  24.04  115.8151  36.08  14.1050
2018-03-27 09:45:00+00:00  8.1431  34.2406  23.92  112.9927  35.98  14.0402
2018-03-27 10:45:00+00:00  8.1431  34.1468  23.88  112.9927  35.92  14.0402
2018-03-27 14:45:00+00:00  7.9951  34.1468  23.56  112.9927  35.02  14.0031
2018-03-27 15:00:00+00:00  7.9786  33.9969  23.58  112.7981  35.00  14.0031
[1283 rows x 6 columns]
In the world of finance, understanding the dynamics of stock markets is paramount for investors, analysts, and policymakers alike. This report analyzes the stock prices of six prominent companies, AKBNK, CCOLA, MGROS, OTKAR, PGSUS, and TCELL, over a two-year period. We aim to identify the factors that contribute most strongly to the evolution of these prices by combining statistical techniques, market context, and online search trends.
Our approach begins with an analysis of the monthly evolution of stock prices for the selected companies. To detect potential outliers, we use two distinct methods: the Interquartile Range (IQR) and the 3-sigma rule. The IQR method identifies anomalies through box plots: with the default whisker length, points lying more than 1.5 times the IQR below the first quartile or above the third quartile are drawn as outliers. The 3-sigma rule involves calculating monthly means and standard deviations; data points falling outside the range from (mean - 3 standard deviations) to (mean + 3 standard deviations) are marked as outliers. This analysis allows us to pinpoint specific price movements that deviate significantly from the norm. For each identified outlier, we carry out preliminary research into market dynamics to shed light on the potential driving forces.
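To make the difference between the two rules concrete, here is a minimal sketch applied to one month of made-up prices. The helper names `iqr_outliers` and `sigma_outliers` and the sample values are assumptions for illustration only; they are not part of the analysis code in this report.

```python
import pandas as pd

def iqr_outliers(prices: pd.Series, k: float = 1.5) -> pd.Series:
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (the box-plot whisker rule)."""
    q1, q3 = prices.quantile(0.25), prices.quantile(0.75)
    iqr = q3 - q1
    return prices[(prices < q1 - k * iqr) | (prices > q3 + k * iqr)]

def sigma_outliers(prices: pd.Series, n_sigma: float = 3.0) -> pd.Series:
    """Return values outside mean +/- n_sigma * sample standard deviation."""
    mean, std = prices.mean(), prices.std()
    return prices[(prices < mean - n_sigma * std) | (prices > mean + n_sigma * std)]

# Made-up prices for a single month: mostly quiet, one mild jump, one large spike.
sample = pd.Series([10.0, 10.1, 9.9, 10.2, 9.8] * 3 + [10.0, 10.1, 9.9, 10.6, 14.0])

iqr_flagged = iqr_outliers(sample)      # flags both 10.6 and 14.0
sigma_flagged = sigma_outliers(sample)  # flags only 14.0
print("IQR rule:    ", iqr_flagged.tolist())
print("3-sigma rule:", sigma_flagged.tolist())
```

In this sample the large spike inflates the standard deviation, so the 3-sigma band is wide and only the spike itself is flagged, while the quartile-based rule also catches the milder jump. This may help explain why a box plot can report more outliers than the 3-sigma rule for the same month.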
Recognizing that stock prices are often influenced by external factors such as news, events, and public sentiment, we incorporate Google Trends data into our analysis. Google Trends shows how popular specific search keywords are over time, and search interest can be correlated with sales and market behavior. We select keywords associated with each stock and use Google Trends to check for events or news coinciding with the dates of the detected outliers.
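One possible way to line up an outlier date with the weekly Google Trends score is sketched below. All values are made up, and the column name `score` is hypothetical; the sketch only assumes that the `Week` column marks the start of each week, as in a typical weekly Google Trends export.

```python
import pandas as pd

# Hypothetical outlier timestamps and prices (made up for illustration).
outliers = pd.DataFrame({
    "timestamp": pd.to_datetime(["2016-04-05 09:30", "2016-04-20 14:15"]),
    "price": [6.85, 6.75],
})

# Hypothetical weekly search scores; 'Week' marks the start of each week.
trends = pd.DataFrame({
    "Week": pd.to_datetime(["2016-03-27", "2016-04-03", "2016-04-10", "2016-04-17"]),
    "score": [58, 62, 64, 69],
})

# For each outlier, pick the latest week that starts on or before its timestamp.
matched = pd.merge_asof(
    outliers.sort_values("timestamp"),
    trends.sort_values("Week"),
    left_on="timestamp",
    right_on="Week",
    direction="backward",
)
print(matched[["timestamp", "price", "Week", "score"]])
```

`pd.merge_asof` requires both key columns to be sorted and of the same dtype, so timezone-aware price timestamps would have to be made timezone-naive (or the weeks localized) before matching.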
By the end, we aim to provide a comprehensive view of the stock price evolution of these companies, offering a valuable tool for understanding the ever-changing world of finance.
The data used in this analysis comprises the closing prices of selected stocks listed on Borsa İstanbul, Turkey's main stock exchange. These closing prices have been collected at a high frequency, with observations recorded at 15-minute intervals. The data collection period spans from March 2016 to March 2018, providing a two-year timeframe for the analysis. The selected stocks for this analysis include AKBNK, CCOLA, MGROS, OTKAR, PGSUS, and TCELL. These stocks are representative of different sectors and have been chosen to gain insights into various aspects of the stock market.
The world of stock prices is a complex landscape in which a large number of factors can influence price movements. From global economic trends and geopolitical events to company-specific news and investor decisions, there are countless variables at play. The interplay of market forces, industry dynamics, government policies, and even unexpected black swan events can all contribute to stock price fluctuations.
In the case of Turkey, we need to consider that the country experienced significant political instability during this period, including a failed coup attempt in July 2016, a constitutional referendum in April 2017, and ongoing tensions with various countries. These events likely had a significant impact on the overall Turkish economy and stock market. The Turkish Lira also depreciated significantly, which could have affected companies with foreign currency debt or those that rely heavily on imports. Yet the country announced economic growth of 7.3% in 2017, which is consistent with the upward price trend of five of the six companies, Otokar being the exception.
# Extract the year and month from the dates
df_2_years_span['Year'] = df_2_years_span.index.year
df_2_years_span['Month'] = df_2_years_span.index.month
# Create a dictionary to store outliers for each stock
outliers_dict = {stock: {month: [] for month in range(1, 26)} for stock in stocks}
# Create subplots for each stock
fig, axes = plt.subplots(6, 1, figsize=(20, 70), sharex=False)
axes = axes.flatten()
# Create a list of dates for the x-axis
dates = pd.date_range(start='2016-03-28', periods=25, freq='M')
for idx, stock in enumerate(stocks):
    ax = axes[idx]
    for month in range(3, 28):
        if month in range(3, 13):
            year = 2016
        elif month in range(13, 25):
            year = 2017
        else:
            year = 2018
        if month % 12 == 0:
            tmpMonth = 12
        else:
            tmpMonth = month % 12
        month -= 2
        filtered_data = df_2_years_span[(df_2_years_span['Year'] == year) & (df_2_years_span['Month'] == tmpMonth)]
        # Customize the appearance of outliers
        flierprops = dict(marker='o', markerfacecolor='red', markersize=8, linestyle='none')
        boxplot = ax.boxplot(filtered_data[stock], positions=[month], widths=0.2, flierprops=flierprops)
        # Identify and annotate outliers
        outliers = boxplot['fliers'][0].get_data()[1]
        outliers_dict[stock][month].extend(outliers)
    ax.set_xticks(range(1, 26))
    # Set font size for x-axis tick labels
    ax.tick_params(axis='x', labelsize=15)
    # Set font size for y-axis tick labels
    ax.tick_params(axis='y', labelsize=15)
    ax.set_xticklabels([date.strftime('%b %Y') for date in dates], rotation=45)
    ax.set_xlabel('Date', fontsize=20)
    ax.set_ylabel('Stock Price', fontsize=20)
    ax.set_title(stock)
plt.tight_layout()
# Print the list of all outliers for each stock
for stock in stocks:
    for month in range(3, 28):
        if month in range(3, 13):
            year = 2016
        elif month in range(13, 25):
            year = 2017
        else:
            year = 2018
        if month % 12 == 0:
            tmpMonth = 12
        else:
            tmpMonth = month % 12
        month = month - 2
        outlier_values = outliers_dict[stock][month]
        if len(outlier_values) > 0:
            print(f'Outliers in {stock} for {tmpMonth}-{year}: {outlier_values}')
plt.show()
Outliers in AKBNK for 5-2016: [6.8525, 6.7483]
Outliers in AKBNK for 11-2016: [6.6681]
Outliers in AKBNK for 1-2017: [6.7563, 6.7322]
Outliers in AKBNK for 3-2017: [7.4135]
Outliers in AKBNK for 10-2017: [7.4769, 7.4769]
Outliers in AKBNK for 1-2018: [9.2124]
Outliers in CCOLA for 5-2016: [38.2501, 38.2501, 36.7454, 36.6897]
Outliers in CCOLA for 7-2016: [37.598, 37.4862, 38.0454]
Outliers in CCOLA for 8-2016: [33.4224]
Outliers in CCOLA for 3-2017: [32.1363]
Outliers in CCOLA for 7-2017: [40.3878, 41.0811, 41.4935, 41.2686, 40.7063, 40.4252]
Outliers in CCOLA for 10-2017: [33.266, 33.3597, 33.3034, 33.4347, 33.4534, 33.5097, 33.7158, 33.7346, 33.5471, 34.0344, 34.0344, 34.0344, 36.1709, 36.1709]
Outliers in CCOLA for 11-2017: [32.9474, 32.8162, 36.3583, 37.3516, 36.5458, 36.5458]
Outliers in CCOLA for 12-2017: [33.4721]
Outliers in MGROS for 5-2016: [19.53, 19.3]
Outliers in MGROS for 10-2016: [18.67, 18.57]
Outliers in MGROS for 12-2016: [16.66, 16.61, 16.55]
Outliers in MGROS for 8-2017: [28.52, 28.54]
Outliers in MGROS for 10-2017: [24.32, 24.34, 24.32, 24.4, 24.32, 24.32, 24.5]
Outliers in OTKAR for 4-2016: [97.583, 103.0843, 103.642]
Outliers in OTKAR for 5-2016: [99.1803, 97.6001, 94.7185, 94.9045]
Outliers in OTKAR for 7-2016: [99.1803, 98.9945, 98.7155]
Outliers in OTKAR for 11-2017: [126.7825]
Outliers in OTKAR for 12-2017: [102.0144, 105.2409, 104.9562, 105.0511, 107.2337, 107.3286, 107.8031, 106.949, 107.4235, 107.4235, 107.0439]
Outliers in OTKAR for 1-2018: [123.1764, 123.2713]
Outliers in OTKAR for 3-2018: [113.402, 113.402, 114.3552, 115.3285, 114.7445, 115.8151]
Outliers in PGSUS for 12-2016: [13.55, 13.54, 13.35, 13.32, 14.15]
Outliers in PGSUS for 2-2017: [15.46, 15.27]
Outliers in PGSUS for 3-2017: [15.67, 15.65, 15.58, 15.6, 15.6, 15.54]
Outliers in PGSUS for 12-2017: [26.6, 26.32, 26.3, 26.3]
Outliers in PGSUS for 3-2018: [37.8, 37.78]
Outliers in TCELL for 5-2016: [10.0346, 10.0179, 9.9009, 9.8505, 9.7586]
Outliers in TCELL for 7-2016: [9.4493, 9.4576, 9.4409, 9.349, 9.3072]
Outliers in TCELL for 11-2016: [7.8604, 7.8101, 8.312]
Outliers in TCELL for 3-2017: [10.4275]
Outliers in TCELL for 8-2017: [11.6013]
Outliers in TCELL for 11-2017: [13.4123, 13.3582, 13.6555, 13.7005]
Outliers in TCELL for 2-2018: [13.5483, 13.5576]
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Extract the year and month from the dates
df_2_years_span['Year'] = df_2_years_span.index.year
df_2_years_span['Month'] = df_2_years_span.index.month
# Create subplots for each month
num_stocks = len(stocks)
num_months = 25
num_rows = num_stocks
num_cols = num_months
fig, axes = plt.subplots(num_rows, num_cols, figsize=(20, 30), sharex=False, sharey=False)
sigma = 3
for stock_idx, stock in enumerate(stocks):
    stock_title = f'{stock}'  # Set a single title for each stock
    for month in range(3, 28):
        if month in range(3, 13):
            year = 2016
        elif month in range(13, 25):
            year = 2017
        else:
            year = 2018
        if month % 12 == 0:
            tmpMonth = 12
        else:
            tmpMonth = month % 12
        month -= 2
        filtered_data = df_2_years_span[(df_2_years_span['Year'] == year) & (df_2_years_span['Month'] == tmpMonth)]
        # Calculate control limits
        mean = filtered_data[stock].mean()
        std = filtered_data[stock].std()
        upper_limit = mean + sigma * std
        lower_limit = mean - sigma * std
        # Plot the data points
        x = np.arange(1, len(filtered_data) + 1)
        y = filtered_data[stock]
        # month now runs from 1 to 25, while subplot columns are 0-based
        ax = axes[stock_idx, month - 1]
        yTmp = np.array(y)
        # Find outliers
        outliers = np.where((y > upper_limit) | (y < lower_limit))
        non_outliers = np.where((y <= upper_limit) & (y >= lower_limit))
        # Plot the data points, marking outliers differently
        ax.plot(x[non_outliers], yTmp[non_outliers], label='Data', marker='o', linestyle='-', color='b', markersize=2)
        ax.plot(x[outliers], yTmp[outliers], label='Outliers', marker='x', linestyle='', color='r')
        # Create arrays for control limits with the same shape as the data
        control_limits_upper = np.full(len(filtered_data), upper_limit)
        control_limits_lower = np.full(len(filtered_data), lower_limit)
        control_limits_center = np.full(len(filtered_data), mean)
        # Plot the control limits
        ax.plot(x, control_limits_upper, linestyle='--', color='g', label=f'{sigma}-sigma Upper Limit')
        ax.plot(x, control_limits_lower, linestyle='--', color='g', label=f'{sigma}-sigma Lower Limit')
        ax.plot(x, control_limits_center, linestyle='--', color='g', label='Center Line')
        ax.tick_params(axis='x')
        ax.tick_params(axis='y')
        ax.set_ylim(min(lower_limit, yTmp.min()) - 1, max(upper_limit, yTmp.max()) + 1)  # Dynamic y-range
        # Report the calendar month (tmpMonth), not the shifted plot position
        if len(yTmp[outliers]) > 0:
            print("Outliers for", f'{stock}: {tmpMonth}-{year} : {yTmp[outliers]}')
    ax = axes[stock_idx, 0]
    ax.set_title(stock_title, fontsize=15)
    ax.set_ylabel('Stock Price')
# Adjust the layout
plt.show()
Outliers for AKBNK: 5-2016 : [6.8525]
Outliers for AKBNK: 3-2017 : [7.4135]
Outliers for CCOLA: 5-2016 : [38.2501 38.2501]
Outliers for CCOLA: 11-2017 : [37.3516]
Outliers for OTKAR: 5-2016 : [99.1803]
Outliers for OTKAR: 12-2017 : [102.0144]
Outliers for OTKAR: 3-2018 : [115.8151]
Outliers for PGSUS: 12-2016 : [14.15]
Outliers for TCELL: 11-2017 : [13.6555 13.7005]
Outliers for TCELL: 2-2018 : [13.5483 13.5576]
For a more specific, company-by-company analysis, we examine the output above and look for factors or events that may have played a critical role in price fluctuations. Looking at the overall trend for AKBNK, some of the lowest prices are observed from Nov 2016 to Jan 2017. One of the reasons behind such a decline could be a critical event such as Moody’s downgrade of Turkey’s sovereign credit rating in Sep 2016.
To dig deeper, we also analyze the outliers. These represent instances where stock prices deviated significantly from the typical level of the month, as flagged by the box plots. For AKBNK, the outliers fall in May and Nov 2016, in Jan, Mar, and Oct 2017, and in Jan 2018. For each identified outlier, we looked into the underlying factors that might have contributed to the price movement. For example, some outliers in Jan 2017 lie above the typical price level of that month. Through simple desk research, we found that Akbank was selected as the “Most Successful Brand” within the banking industry by the ONE Awards. Although this cannot be stated as the sole reason for such a fluctuation, it may be a good starting point for understanding the market dynamics.
Coca-Cola, a global beverage giant, is known for its diverse range of products. CCOLA is the stock of Coca-Cola İçecek, which is the franchise of the company in Turkey. One of the most prominent factors that may affect CCOLA prices is seasonality. Since Coca-Cola offers a wide range of cold beverages, summer is the company's strongest season. In the overall trend for CCOLA, the rise in summer and the fall in winter are clearly noticeable.
Regarding outliers, CCOLA provides more insightful output. The outliers are concentrated in May and Jul 2016 and in Mar, Jul, Oct, and Nov 2017. In July 2017, a number of outliers lie above the typical price level of the month. Our research found an important company update that might be the reason: a businesswoman was promoted to president of the Turkey, Caucasus and Central Asia business unit.
Migros is one of the largest grocery chains in Turkey. Looking at the box plots of the MGROS stock, the first thing to notice is a significant upward shift in the period from Jan 2017 to Aug 2017. A factor behind such a sustained increase could be an improvement in the country's economic conditions or an announcement of growth or expansion plans.
In terms of outliers, MGROS does not provide enough data for analysis. The outliers are spread across separate months: May, Oct, and Dec 2016, and Aug and Oct 2017. We can, however, look at the interquartile ranges of MGROS, which tend to be narrower than those of the other stocks. This generally indicates more stable prices within each month, which may make the stock appear less risky, although a narrow IQR alone does not make it a safe investment.
Another company of focus is Otokar. Looking at the box plots of the OTKAR stock, the interquartile ranges are very wide, especially from Nov 2016 to Nov 2017. A wider IQR indicates that prices fluctuated more during that month, suggesting that the stock was more volatile. Additionally, just as in the case of Migros, there is a significant shift in price in 2017. Such a result could stem from an improvement in the country's economic conditions, the announcement of earnings reports, or external shocks.
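The idea of reading the IQR width as a volatility indicator can be sketched as follows. The price series is synthetic (a calm month followed by a volatile one) and the helper `monthly_iqr` is a hypothetical name introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices: small swings in January, large swings in February.
idx = pd.date_range("2017-01-02", "2017-02-28", freq="D")
n_jan = int((idx.month == 1).sum())
n_feb = int((idx.month == 2).sum())
calm = 100 + 0.5 * np.sin(np.arange(n_jan))
wild = 100 + 5.0 * np.sin(np.arange(n_feb))
prices = pd.Series(np.concatenate([calm, wild]), index=idx)

def monthly_iqr(series: pd.Series) -> pd.Series:
    """Width of the interquartile range (Q3 - Q1) for each calendar month."""
    grouped = series.groupby([series.index.year, series.index.month])
    return grouped.quantile(0.75) - grouped.quantile(0.25)

iqr_by_month = monthly_iqr(prices)
print(iqr_by_month)
```

A table of monthly IQR widths like this makes it possible to compare months or stocks numerically instead of judging box heights by eye. For comparisons across stocks with very different price levels, dividing the width by the monthly median might be a fairer measure.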
In terms of outliers, some cluster around May 2016 and Dec 2017. Considering that Otokar announced that it had received an order worth €65M in May 2016, the higher stock prices in that month may be explainable.
Shifting our attention to Pegasus, the interquartile ranges become notably wider, particularly from July 2017 to December 2017. As stated above, wider interquartile ranges signify larger price fluctuations for the stock throughout that timeframe.
In the context of outliers, the PGSUS dataset provides limited data for robust analysis. The available outlier data is distributed across distinct months, including December 2016, as well as February, March, and December 2017.
Now, let's turn our focus to Turkcell. In 2017, TCELL underwent a substantial upward shift in stock prices with a relatively stable interquartile range. While this shift could be driven by improvements in the economic landscape of Turkey, it may also reflect growth in the sector, considering the rapid advancements in technology.
For outlier analysis, the Turkcell dataset offers limited data for a comprehensive examination. However, we can comment on the higher stock prices in November 2017. In that month, Turkcell announced several positive company developments, such as the procurement of 4G and a large sale of company real estate.
# Grouping by the 'Year' and 'Month' columns and taking the mean to get monthly stock prices
grouped_means = df_2_years_span.groupby(['Year', 'Month']).mean()
hisse_columns = ['AKBNK', 'CCOLA', 'MGROS', 'OTKAR', 'PGSUS', 'TCELL']
grouped_means
| Year | Month | AKBNK | CCOLA | MGROS | OTKAR | PGSUS | TCELL |
|---|---|---|---|---|---|---|---|
| 2016 | 3 | 6.313580 | 35.383650 | 17.554000 | 96.398200 | 16.958000 | 9.744440 |
| 2016 | 4 | 6.635565 | 37.219850 | 18.797917 | 100.727310 | 17.100625 | 9.797998 |
| 2016 | 5 | 6.263860 | 33.377233 | 17.302326 | 90.599516 | 15.355581 | 9.166835 |
| 2016 | 6 | 6.388429 | 33.728376 | 16.787347 | 91.189263 | 13.983469 | 8.820312 |
| 2016 | 7 | 6.458234 | 34.520581 | 16.173125 | 94.367128 | 12.529375 | 8.899109 |
| 2016 | 8 | 6.380540 | 35.377543 | 16.542857 | 95.595031 | 12.377143 | 8.507255 |
| 2016 | 9 | 6.544443 | 34.959091 | 18.312500 | 97.980398 | 12.103409 | 8.274507 |
| 2016 | 10 | 6.592274 | 34.443921 | 17.634359 | 96.046149 | 12.771538 | 8.353303 |
| 2016 | 11 | 6.309060 | 31.128490 | 17.408667 | 103.123073 | 13.033000 | 8.058867 |
| 2016 | 12 | 6.206458 | 30.727497 | 17.730789 | 115.349176 | 12.662105 | 7.725066 |
| 2017 | 1 | 6.262686 | 33.366227 | 17.979412 | 125.456737 | 14.763333 | 8.788435 |
| 2017 | 2 | 7.032384 | 34.887290 | 19.786452 | 129.120032 | 16.002258 | 9.520465 |
| 2017 | 3 | 7.114857 | 33.426527 | 21.418649 | 118.295424 | 14.698378 | 10.161597 |
| 2017 | 4 | 7.511865 | 32.758671 | 23.128387 | 119.095823 | 15.287097 | 10.166497 |
| 2017 | 5 | 7.737458 | 34.014040 | 25.609667 | 125.398578 | 17.126833 | 9.838935 |
| 2017 | 6 | 8.036492 | 35.873235 | 26.807059 | 110.575563 | 18.935490 | 10.042322 |
| 2017 | 7 | 8.452770 | 38.839380 | 28.165000 | 115.568823 | 21.496000 | 10.587110 |
| 2017 | 8 | 8.515671 | 38.829157 | 29.289706 | 111.618512 | 25.802059 | 11.144031 |
| 2017 | 9 | 8.220367 | 36.102459 | 27.055455 | 104.140920 | 27.363636 | 11.166838 |
| 2017 | 10 | 7.880310 | 34.801347 | 25.910769 | 100.399904 | 27.824359 | 12.160541 |
| 2017 | 11 | 7.901326 | 34.918356 | 26.404242 | 112.949100 | 27.953636 | 12.842048 |
| 2017 | 12 | 7.651815 | 32.040852 | 25.127160 | 113.423112 | 31.351358 | 13.805280 |
| 2018 | 1 | 8.360005 | 33.838879 | 26.543951 | 114.727060 | 34.393580 | 14.307528 |
| 2018 | 2 | 8.635036 | 31.571036 | 24.666575 | 113.071829 | 35.140274 | 14.196249 |
| 2018 | 3 | 8.458001 | 34.968162 | 23.970000 | 110.662196 | 35.806757 | 13.973924 |
for column in hisse_columns:
    plt.figure(figsize=(12, 6))
    grouped_means[column].plot(title=f"{column} Monthly Mean Stock Prices", xlabel="Date", ylabel="Mean Prices")
    # Putting the month names on x axis in year, month format
    plt.xticks(range(len(grouped_means.index)), [f"({x[0]}, {x[1]})" for x in grouped_means.index], rotation=45)
    plt.grid(True)
    plt.tight_layout()
    plt.show()
# Create data frames including Google Trends query scores for each stock
# Query for "akbank temettü" is changed as "akbank temettu" to conform to the English keyboard
df_csv_akbnk = pd.read_csv('akbnk.csv')
df_akbnk = pd.DataFrame(df_csv_akbnk)
df_csv_ccola = pd.read_csv('ccola.csv')
df_ccola = pd.DataFrame(df_csv_ccola)
df_csv_mgros = pd.read_csv('mgros.csv')
df_mgros = pd.DataFrame(df_csv_mgros)
df_csv_otkar = pd.read_csv('otkar.csv')
df_otkar = pd.DataFrame(df_csv_otkar)
df_csv_pgsus = pd.read_csv('pgsus.csv')
df_pgsus = pd.DataFrame(df_csv_pgsus)
df_csv_tcell = pd.read_csv('tcell.csv')
df_tcell = pd.DataFrame(df_csv_tcell)
print(df_akbnk)
print(df_ccola)
print(df_mgros)
print(df_otkar)
print(df_pgsus)
print(df_tcell)
Week akbank temettu akbank hisse akbnk
0 2016-03-27 0 58 0
1 2016-04-03 0 62 15
2 2016-04-10 0 64 18
3 2016-04-17 0 69 13
4 2016-04-24 0 38 9
.. ... ... ... ...
100 2018-02-25 5 56 16
101 2018-03-04 0 84 13
102 2018-03-11 7 57 17
103 2018-03-18 0 74 37
104 2018-03-25 30 87 16
[105 rows x 4 columns]
Week ccola coca cola hisse ccola hisse
0 2016-03-27 18 38 0
1 2016-04-03 13 42 0
2 2016-04-10 24 0 0
3 2016-04-17 0 24 0
4 2016-04-24 0 30 0
.. ... ... ... ...
100 2018-02-25 90 69 54
101 2018-03-04 54 22 28
102 2018-03-11 58 22 38
103 2018-03-18 46 64 18
104 2018-03-25 79 47 65
[105 rows x 4 columns]
Week mgros migros hisse migros hisse yorum
0 2016-03-27 0 27 0
1 2016-04-03 13 33 0
2 2016-04-10 14 0 0
3 2016-04-17 40 17 0
4 2016-04-24 9 23 10
.. ... ... ... ...
100 2018-02-25 49 0 0
101 2018-03-04 50 65 0
102 2018-03-11 64 100 0
103 2018-03-18 68 69 0
104 2018-03-25 38 69 0
[105 rows x 4 columns]
Week otkar otkar hisse otokar hisse
0 2016-03-27 20 11 26
1 2016-04-03 34 16 20
2 2016-04-10 27 11 9
3 2016-04-17 21 4 17
4 2016-04-24 11 9 13
.. ... ... ... ...
100 2018-02-25 31 0 57
101 2018-03-04 26 9 43
102 2018-03-11 16 6 53
103 2018-03-18 33 0 41
104 2018-03-25 39 16 53
[105 rows x 4 columns]
Week pgsus pgsus hisse pegasus hisse
0 2016-03-27 31 13 28
1 2016-04-03 30 14 31
2 2016-04-10 27 13 29
3 2016-04-17 42 5 49
4 2016-04-24 31 8 28
.. ... ... ... ...
100 2018-02-25 14 0 29
101 2018-03-04 17 16 35
102 2018-03-11 14 4 42
103 2018-03-18 23 6 34
104 2018-03-25 4 6 21
[105 rows x 4 columns]
Week tcell tcell hisse turkcell hisse
0 2016-03-27 19 14 18
1 2016-04-03 41 11 35
2 2016-04-10 28 8 17
3 2016-04-17 45 0 0
4 2016-04-24 38 0 15
.. ... ... ... ...
100 2018-02-25 22 14 17
101 2018-03-04 29 11 15
102 2018-03-11 45 10 38
103 2018-03-18 27 7 18
104 2018-03-25 23 9 25
[105 rows x 4 columns]
list_stocks = [df_akbnk, df_ccola, df_mgros, df_otkar, df_pgsus, df_tcell]
for stock in list_stocks:
    stock['Week'] = pd.to_datetime(stock['Week'])
    # Extracting the month and year information from the week column
    stock['Year'] = stock['Week'].dt.year
    stock['Month'] = stock['Week'].dt.month
    # Grouping by month and taking the mean for each query
    # (numeric_only=True keeps the datetime 'Week' column out of the averages)
    grouped_means = stock.groupby(['Year', 'Month']).mean(numeric_only=True)
    # Line plot
    plt.figure(figsize=(14, 7))
    for column in grouped_means.columns:
        if column not in ['Year', 'Month']:
            grouped_means[column].plot(label=column)
    plt.title("Mean Monthly Scores")
    plt.xlabel("Date")
    plt.ylabel("Mean Scores")
    plt.xticks(range(len(grouped_means.index)), [f"({x[0]}, {x[1]})" for x in grouped_means.index], rotation=45)
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.show()
import matplotlib.pyplot as plt
grouped_means_2_years_span = df_2_years_span.groupby(['Year', 'Month']).mean()
# numeric_only=True keeps the datetime 'Week' column out of the monthly averages
grouped_means_akbnk = df_akbnk.groupby(['Year', 'Month']).mean(numeric_only=True)
grouped_means_ccola = df_ccola.groupby(['Year', 'Month']).mean(numeric_only=True)
grouped_means_mgros = df_mgros.groupby(['Year', 'Month']).mean(numeric_only=True)
grouped_means_otkar = df_otkar.groupby(['Year', 'Month']).mean(numeric_only=True)
grouped_means_pgsus = df_pgsus.groupby(['Year', 'Month']).mean(numeric_only=True)
grouped_means_tcell = df_tcell.groupby(['Year', 'Month']).mean(numeric_only=True)
def plot_stock(df, stock_symbol, df_2_years_span):
    # Grouping by Year and Month (numeric_only=True skips the datetime 'Week' column)
    grouped_means = df.groupby(['Year', 'Month']).mean(numeric_only=True)
    # Plotting
    plt.figure(figsize=(14, 7))
    # The first y axis
    ax1 = plt.gca()
    # Lines for the Google Trends queries
    for column in grouped_means.columns:
        grouped_means[column].plot(ax=ax1, label=column)
    # Second y axis
    ax2 = ax1.twinx()
    # Adding the mean stock prices to the second y axis
    df_2_years_span[stock_symbol].plot(ax=ax2, label=stock_symbol, color='purple', linestyle='--')
    # x axis arrangement
    ax1.set_xticks(range(len(grouped_means.index)))
    ax1.set_xticklabels([f"({x[0]}, {x[1]})" for x in grouped_means.index], rotation=45)
    # Headings
    ax1.set_title("Monthly Mean Values")
    ax1.set_xlabel("Date")
    ax1.set_ylabel("Mean Scores for Queries")
    ax2.set_ylabel("Mean Stock Prices")
    lines_1, labels_1 = ax1.get_legend_handles_labels()
    lines_2, labels_2 = ax2.get_legend_handles_labels()
    ax1.legend(lines_1 + lines_2, labels_1 + labels_2, loc=0)
    ax1.grid(True)
    plt.tight_layout()
    plt.show()
# Define a list of stocks and their dataframes
stocks = {
'AKBNK': df_akbnk,
'CCOLA': df_ccola,
'MGROS': df_mgros,
'OTKAR': df_otkar,
'PGSUS': df_pgsus,
'TCELL': df_tcell
}
# For each stock, call the plot function
for stock_symbol, df in stocks.items():
    plot_stock(df, stock_symbol, grouped_means_2_years_span)
The second part of the analysis examines whether there is a correlation between the search volume of selected keywords and the stock prices. The Google Trends tool was very helpful in this process. For AKBNK, the plots do not show a correlation strong enough to comment on. However, there is a seemingly positive correlation with the "akbnk" keyword for the first 12 months of the period, which can also be deduced from the scatter plots below. After that, stock prices behave randomly relative to the search keywords. Various factors may have driven this, considering that people browse the internet with differing intentions.
Similar keywords are checked for CCOLA as for AKBNK. Again, the plotted data does not exhibit an apparent correlation that can be interpreted. Yet, in the scatter plot for the "ccola" keyword, a cluster of points is observed in the first part of the plot, which may suggest a positive correlation for the first 6-7 months of the period. The observed patterns might be influenced by various factors, making it challenging to establish a definitive link between search trends and stock prices.
Just as in the previous cases, the data plotted for MGROS does not reveal a clear and interpretable correlation. The patterns we observe may be influenced by various factors, making it difficult to establish a definitive link between search trends and stock prices.
For OTKAR, there is no concrete evidence of a consistent positive relationship between the keywords and the stock prices. For the first 9 months and the last 6 months of the period, OTKAR prices seem to be positively correlated with the search volume of the "otkar" keyword. This positive correlation can also be observed in the scatter plot. Yet the correlation does not hold in most months of 2017, which could indicate an external shock during that period.
The alignment between keywords and PGSUS stock prices does not yield a consistent positive relationship. While a positive correlation with the keyword "pgsus hisse" is noticeable for the first 12 months, it wanes over the rest of the chosen period. This is apparent both in the line graphs above and in the scatter plots below. It may be the result of unforeseen factors during that period that disrupted the previously observed correlation between keyword search volume and stock prices, underscoring the intricate nature of market dynamics.
The data visualization for TCELL does not present an interpretable correlation either; as with the other stocks, the observed patterns may be shaped by various factors and speculation.
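The observation that a correlation holds in one part of the period and fades in another can be checked numerically by computing the Pearson coefficient separately for each sub-period. The sketch below uses made-up monthly values, and the helper `pearson_by_window` is a hypothetical name; it is not the method used to produce the plots in this report.

```python
import numpy as np
import pandas as pd

# Made-up monthly means for 24 months: the search score rises with the price
# in the first year and falls while the price keeps rising in the second year.
months = pd.period_range("2016-04", periods=24, freq="M")
price = pd.Series(np.r_[np.linspace(6.0, 7.1, 12), np.linspace(7.2, 8.3, 12)], index=months)
score = pd.Series(np.r_[np.linspace(10, 32, 12), np.linspace(40, 18, 12)], index=months)

def pearson_by_window(x: pd.Series, y: pd.Series, split: int) -> dict:
    """Pearson correlation over the full period, the first `split` rows, and the rest."""
    return {
        "full": x.corr(y),
        "first": x.iloc[:split].corr(y.iloc[:split]),
        "rest": x.iloc[split:].corr(y.iloc[split:]),
    }

result = pearson_by_window(score, price, split=12)
for window, value in result.items():
    print(f"{window}: {value:.2f}")
```

Splitting the period this way shows how a full-period coefficient can hide a strong relationship that exists only in a sub-period. With only about a dozen monthly points per window, any such coefficient should still be read with caution.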
import matplotlib.pyplot as plt
# Stock list
hisse_listesi = ["AKBNK", "CCOLA", "MGROS", "OTKAR", "PGSUS", "TCELL"]
# Assigning data frames and the corresponding stocks
dataframes = {
"AKBNK": grouped_means_akbnk,
"CCOLA": grouped_means_ccola,
"MGROS": grouped_means_mgros,
"OTKAR": grouped_means_otkar,
"PGSUS": grouped_means_pgsus,
"TCELL": grouped_means_tcell,
}
for hisse in hisse_listesi:
    merged_df = dataframes[hisse].merge(grouped_means_2_years_span[[hisse]], on=['Year', 'Month'], how='outer')
    for column in dataframes[hisse].columns:
        selected_columns_df = merged_df[[column, hisse]]
        # Scatter plot
        plt.figure(figsize=(10, 6))
        plt.scatter(selected_columns_df[column], selected_columns_df[hisse], color='blue', alpha=0.6)
        # Headings
        plt.title(f"{hisse} vs {column}")
        plt.xlabel(f"{column}")
        plt.ylabel(hisse)
        plt.grid(True)
        # Show the plot
        plt.tight_layout()
        plt.show()
import seaborn as sns
# Concatenating all dataframes into a single dataframe
all_data = pd.concat(dataframes.values(), keys=dataframes.keys())
# Calculating the correlation matrix for the concatenated dataframe
corr_matrix = all_data.corr()
# Creating a heatmap
plt.figure(figsize=(12, 8))
sns.color_palette("rocket_r", as_cmap=True)
sns.heatmap(corr_matrix, annot=True, fmt=".2f", cmap='Purples', linewidths=.5)
plt.title('Correlation Matrix Heatmap')
plt.xlabel('Columns')
plt.ylabel('Columns')
plt.tight_layout()
plt.show()
In conclusion, our analysis of the stock prices of AKBNK, CCOLA, MGROS, OTKAR, PGSUS, and TCELL shows that their dynamics are influenced by many distinct factors. This report provides a detailed look at the evolution of these prices, offering insights for investors, analysts, and researchers seeking to understand the ever-changing world of finance. As we move forward, understanding the forces shaping stock prices remains essential for making informed decisions in financial markets.
How can I draw control charts with 3 sigma limit for 24 months horizon? I want to draw the charts side by side. Please make up some data and draw the charts by using Python.
import numpy as np
import matplotlib.pyplot as plt

# Generate some sample data for demonstration
np.random.seed(0)
data = np.random.normal(0, 1, size=(24, 30))  # 24 months with 30 data points each

# Calculate mean and standard deviation for each month
monthly_means = np.mean(data, axis=1)
monthly_stdevs = np.std(data, axis=1, ddof=1)

# Set the number of sigma (standard deviations) for the control limits
sigma = 3

# Create the control charts
fig, axs = plt.subplots(1, 24, figsize=(24, 4))

for i in range(24):
    x = np.arange(1, 31)  # X-axis values (days)
    y = data[i]
    control_limit = sigma * monthly_stdevs[i]
    upper_limit = monthly_means[i] + control_limit
    lower_limit = monthly_means[i] - control_limit
    axs[i].plot(x, y, label='Data')
    axs[i].axhline(upper_limit, color='r', linestyle='--', label=f'Upper {sigma}-sigma Limit')
    axs[i].axhline(lower_limit, color='g', linestyle='--', label=f'Lower {sigma}-sigma Limit')
    axs[i].set_title(f'Month {i + 1}')
    axs[i].set_xlabel('Day')
    axs[i].set_ylabel('Value')
    axs[i].legend()

plt.tight_layout()
How to color outliers differently?
# Plot the data points, marking outliers differently
ax.plot(x[non_outliers], y[non_outliers], label='Data', marker='o', linestyle='-', color='b')
ax.plot(x[outliers], y[outliers], label='Outliers', marker='x', linestyle='', color='r')
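The two plotting calls above assume that `x`, `y`, `ax`, and the boolean masks `outliers` and `non_outliers` already exist. A minimal sketch of how they might be built for a single month is shown below. The data are made up, and the helper name `three_sigma_masks` is hypothetical; it applies the same mean ± 3 standard deviations rule as the control charts.

```python
import numpy as np
import matplotlib.pyplot as plt


def three_sigma_masks(y, sigma=3):
    """Return boolean masks (outliers, non_outliers) for values outside mean +/- sigma * std."""
    y = np.asarray(y, dtype=float)
    mean = y.mean()
    std = y.std(ddof=1)  # sample standard deviation, as in the control chart code
    outliers = np.abs(y - mean) > sigma * std
    return outliers, ~outliers


# Made-up data for one month: 30 steady values plus one injected spike
y = np.concatenate([np.full(15, 10.0), np.full(15, 10.2), [25.0]])
x = np.arange(1, len(y) + 1)
outliers, non_outliers = three_sigma_masks(y)

fig, ax = plt.subplots(figsize=(8, 4))
# Regular points as a connected blue line, outliers as unconnected red crosses
ax.plot(x[non_outliers], y[non_outliers], label='Data', marker='o', linestyle='-', color='b')
ax.plot(x[outliers], y[outliers], label='Outliers', marker='x', linestyle='', color='r')
ax.legend()
```

In a notebook the figure renders when the cell finishes; in a script, call `plt.show()` afterwards. Because the line is drawn only through the non-outlier points, it skips over the gaps where outliers were removed.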
give alternatives to tightlayout
If you want alternatives to tight_layout for adjusting the layout of your plots, you can consider the following methods:
subplots_adjust: This method allows you to fine-tune the spacing between subplots by adjusting parameters such as left, right, top, and bottom. Here's how to use it:
fig.subplots_adjust(left=0.1, right=0.9, top=0.9, bottom=0.1, wspace=0.2, hspace=0.2)
constrained_layout: This is another built-in layout manager that automatically adjusts the subplots to avoid overlapping content. It is available in recent versions of Matplotlib (2.2.0 and later) and can be enabled as follows:
fig, axes = plt.subplots(num_rows, num_cols, figsize=(20, 20), sharex=False, sharey=False, constrained_layout=True)
Manual Layout Adjustment: You can manually set the position and size of subplots using fig.add_axes() and specifying the position as a fraction of the figure size. This approach provides the most control over subplot positioning:
ax1 = fig.add_axes([0.1, 0.1, 0.4, 0.8])  # [left, bottom, width, height]
ax2 = fig.add_axes([0.5, 0.1, 0.4, 0.8])
gridspec: If you need a more complex grid of subplots, you can use the gridspec functionality to create a grid of subplots with different sizes and positions. This provides greater flexibility in arranging subplots:
import matplotlib.gridspec as gridspec
gs = gridspec.GridSpec(2, 2)
ax1 = plt.subplot(gs[0, 0])
ax2 = plt.subplot(gs[0, 1])
ax3 = plt.subplot(gs[1, :])
These alternatives allow you to control the layout of your subplots based on your specific requirements. Choose the one that best suits your needs for arranging and spacing the subplots in your control chart.
I have a data including 6 stocks in BIST for 2-year horizon for every 15 minutes. I need to draw box plots for each month. For every stock I need to have 24 box plots side by side. Can you do that
I can help you create box plots for each month, but I can't create the plots themselves as I can't generate graphics directly. However, I can guide you through the steps to create these box plots using Python and a library like Matplotlib or Seaborn. Here's a general outline of how you can achieve this:
Data Preparation:
Organize your stock price data into a format that's suitable for creating box plots. It should include columns for the stock symbol, date, and price.
Data Filtering:
Filter your data to include only the two years of interest.
Monthly Aggregation:
Group your data by month, and for each stock, calculate statistics like the quartiles (Q1 and Q3) and the median (Q2) for each month. You can use Python's pandas library for this.
Creating Box Plots:
Use a data visualization library such as Matplotlib or Seaborn to create the box plots.
Here's some example Python code to help you get started using Matplotlib and Pandas:
import pandas as pd
import matplotlib.pyplot as plt
# Assuming you have a DataFrame called 'stock_data' with columns 'Date', 'Stock', and 'Price'
# Filter data for the 2-year horizon
start_date = 'YYYY-MM-DD'  # Replace with your actual start date
end_date = 'YYYY-MM-DD'  # Replace with your actual end date
# .copy() avoids pandas' SettingWithCopyWarning when columns are added below
filtered_data = stock_data[(stock_data['Date'] >= start_date) & (stock_data['Date'] <= end_date)].copy()
# Extract the year and month from the Date column
filtered_data['Year'] = filtered_data['Date'].dt.year
filtered_data['Month'] = filtered_data['Date'].dt.month
# Group data by Stock, Year, and Month
grouped_data = filtered_data.groupby(['Stock', 'Year', 'Month'])
# Create box plots for each stock and each month
for (stock, year, month), group in grouped_data:
    plt.figure(figsize=(8, 6))
    plt.title(f'Box Plot for {stock} - {year}-{month:02}')
    plt.boxplot(group['Price'], labels=[f'{year}-{month:02}'])
    plt.xlabel('Month')
    plt.ylabel('Stock Price')
    plt.show()
Remember to replace 'YYYY-MM-DD' with your actual start and end dates and make sure you have the necessary libraries (Pandas and Matplotlib) installed. This code generates a separate figure for each stock and each month in the specified 2-year horizon. You can customize the plot appearance and labels as needed.
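The loop above opens a separate figure for every stock and month, whereas the request was for 24 box plots side by side per stock. One possible way to get that layout is sketched below. It assumes a wide DataFrame with a datetime index and one column per stock, similar to the one loaded at the top of the notebook. The tickers `STOCK_A` and `STOCK_B`, the random-walk prices, and the helper `monthly_groups` are all made up for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def monthly_groups(prices):
    """Split a datetime-indexed Series into month labels and per-month value arrays."""
    labels, groups = [], []
    # Grouping by (year, month) keeps the months in chronological order
    for (year, month), values in prices.groupby([prices.index.year, prices.index.month]):
        labels.append(f'{year}-{month:02d}')
        groups.append(values.to_numpy())
    return labels, groups


# Made-up 15-minute prices over 24 months for two hypothetical tickers
rng = np.random.default_rng(0)
index = pd.date_range('2016-04-01', '2018-03-31 23:45', freq='15min')
df = pd.DataFrame(
    {name: 10 + rng.normal(0, 0.02, len(index)).cumsum() for name in ['STOCK_A', 'STOCK_B']},
    index=index,
)

# One row of axes per stock; each row holds that stock's 24 monthly boxes side by side
fig, axes = plt.subplots(len(df.columns), 1, figsize=(18, 4 * len(df.columns)), squeeze=False)
for ax, stock in zip(axes[:, 0], df.columns):
    labels, groups = monthly_groups(df[stock])
    ax.boxplot(groups)
    ax.set_xticks(range(1, len(labels) + 1))
    ax.set_xticklabels(labels, rotation=90)
    ax.set_title(f'Monthly box plots for {stock}')
    ax.set_ylabel('Price')
fig.tight_layout()
```

Passing a list of arrays to `boxplot` draws one box per array on a shared axis, which makes months directly comparable within a stock.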
Can you analyze Turkey's economy in 2017
Certainly, I can provide an analysis of Turkey's economy in 2017. Please note that my knowledge is based on information available up to January 2022, so I don't have access to more recent data or developments. In 2017, Turkey's economy can be analyzed as follows:
Growth: Turkey experienced robust economic growth in 2017, with a Gross Domestic Product (GDP) growth rate of approximately 7.4%. This was one of the fastest-growing periods in the country's recent history.
Inflation: In 2017, Turkey faced rising inflation. Annual consumer price inflation was around 11%, a high rate that weighed on consumer confidence.
Exchange Rates: The Turkish lira (TRY) depreciated significantly against the US dollar in 2017, continuing a steady loss of value that preceded the currency crisis of August 2018.
Unemployment: The unemployment rate was a significant concern in 2017. Both overall and youth unemployment rates were relatively high, which indicated challenges in the labor market.
Political Uncertainty: Political uncertainty was a prominent issue in Turkey in 2017. Constitutional changes following the referendum had an impact on the political climate, and there was a sense of growing polarization and uncertainty.
External Relations: Turkey's external relations were marked by tensions, including those with the United States and the European Union. Regional conflicts, such as the situation in Syria, also had an impact on the country's economic environment.
Reforms and Policies: The government announced various economic reform packages in an attempt to address some of the economic challenges, but their effectiveness and implementation were subjects of debate.
Overall, 2017 was a year of robust economic growth in Turkey, but it also raised concerns due to high inflation, currency depreciation, high unemployment, and political uncertainty. These economic and political factors set the stage for some of the economic developments and challenges the country faced in the following years.
How can i analyze the scatter plot of a correlation
Analyzing a scatter plot to assess the correlation between two variables is a fundamental step in understanding the relationship between them. Here's a step-by-step guide on how to analyze a scatter plot for correlation:
Create the Scatter Plot:
Visual Inspection:
Correlation Direction:
Strength of Correlation:
Outliers:
Line of Best Fit:
Correlation Coefficient:
Residual Analysis:
Domain Knowledge:
Statistical Significance:
Additional Analyses:
In summary, analyzing a scatter plot for correlation involves both visual inspection and quantitative analysis. By examining the direction, strength, outliers, and potential domain-specific factors, you can gain a comprehensive understanding of the relationship between the variables in question.
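Several of the steps listed above (the scatter plot itself, the line of best fit, the correlation coefficient, and the residuals) can be sketched with NumPy and Matplotlib. The data below are made up, with a built-in positive relationship, purely to illustrate the mechanics.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data with a positive linear relationship plus noise
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 0.8 * x + rng.normal(scale=0.5, size=200)

# Pearson correlation coefficient: sign gives direction, magnitude gives strength
r = np.corrcoef(x, y)[0, 1]

# Least-squares line of best fit and its residuals
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

fig, (ax_fit, ax_res) = plt.subplots(1, 2, figsize=(12, 4))
ax_fit.scatter(x, y, alpha=0.6, label='Observations')
xs = np.sort(x)
ax_fit.plot(xs, slope * xs + intercept, color='r', label=f'Best fit (r = {r:.2f})')
ax_fit.set_xlabel('x')
ax_fit.set_ylabel('y')
ax_fit.legend()

# Residuals should scatter around zero with no visible pattern if a line is adequate
ax_res.scatter(x, residuals, alpha=0.6)
ax_res.axhline(0, color='k', linestyle='--')
ax_res.set_xlabel('x')
ax_res.set_ylabel('Residual')
fig.tight_layout()
```

For a significance test, `scipy.stats.pearsonr` returns both the coefficient and a p-value, if SciPy is available.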
can you write the following as a markdown notation in Jupyter NOtebook : For a company wise, more specific, analysis, one can carefully focus on the output and determine the factors or events that may have played a critical role in price fluctuations. Having said that, by looking at the overall pace for AKBNK, lowest prices are observed from Nov 2016 to Jan 2017. One of the reasons behind of such an incline could be a critical event such as Moody’s downgrade of Turkey’s sovereign credit rating in Sep 2016. In order to delve deep, another focus point could be an analysis of outliers. These outliers represent instances where stock prices deviated significantly from the expected pace as determined by the 3-sigma rule. Following that, we have determined the outliers for the stock prices of AKBNK as follows: [6.8525, 6.7483], [6.6681], [6.7563, 6.7322], [7.4135], [7.4769, 7.4769], [9.2124] These outliers belong to May, Nov 2016, Jan, Mar, Oct 2017 and Jan 2018, respectively. For each identified outlier, a thorough analysis was conducted to analyze the underlying factors that might have contributed to these price movements. For example, there are some outliers in Jan 2017 performing better than the expected behavior of the stock prices. With a simple desktop research, we have found out that Akbank has been selected as the “Most Successful Brand” within the banking industry by the ONE Awards. Although this can not be stated as the sole reason for such a fluctuation in the stock market, it may be a good starting point for the understanding of market dynamics. The second part of the analysis examines whether there is a correlation between the amount of search of some keywords and the stock prices. In this process, Google Trends tool was very much helpful. For AKBNK case, the printed plots do not provide an enough correlation that could be commented on. There could be various factors that may have driven such a random correlation. 
Moody's downgrades Turkey's sovereign rating, markets shrug By Reuters (investing.com) 2016’da “Yılın Banka Markası” Akbank! - Dosya Haber
Certainly, here's the provided text formatted in Markdown notation for a Jupyter Notebook:
# Company-Wise Analysis
For a more specific, company-level analysis, one can focus on the output and identify the factors or events that may have played a critical role in price fluctuations. Looking at the overall trend for AKBNK, the lowest prices are observed from Nov 2016 to Jan 2017. One possible reason for this decline is a critical event such as Moody’s downgrade of Turkey’s sovereign credit rating in Sep 2016.
To dig deeper, another focus point is an analysis of outliers. These outliers represent instances where stock prices deviated significantly from the expected range as determined by the 3-sigma rule. We identified the following outliers in the AKBNK stock prices: [6.8525, 6.7483], [6.6681], [6.7563, 6.7322], [7.4135], [7.4769, 7.4769], [9.2124]. These outliers belong to May 2016, Nov 2016, Jan 2017, Mar 2017, Oct 2017, and Jan 2018, respectively. For each identified outlier, we examined the underlying factors that might have contributed to these price movements. For example, some outliers in Jan 2017 lie above the expected range of the stock price. Through simple desk research, we found that Akbank was selected as the “Most Successful Brand” in the banking industry by the ONE Awards. Although this cannot be stated as the sole reason for such a fluctuation in the stock market, it may be a good starting point for understanding market dynamics.
# Correlation Analysis
The second part of the analysis examines whether the search volume for selected keywords correlates with the stock prices. The Google Trends tool was very helpful in this process. For AKBNK, the plots do not show a correlation strong enough to comment on. Various factors may explain such a weak, seemingly random relationship.
## References
- [Moody's downgrades Turkey's sovereign rating, markets shrug By Reuters (investing.com)](link-to-article1)
- [2016’da “Yılın Banka Markası” Akbank! - Dosya Haber](link-to-article2)
Make sure to replace "link-to-article1" and "link-to-article2" with the actual links to the referenced articles or sources. This Markdown code can be used in a Jupyter Notebook to present your analysis.
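The correlation check described in the Correlation Analysis section could be sketched roughly as follows. This is not the code used for the report: both series are randomly generated stand-ins (a hypothetical monthly search-interest index in place of a Google Trends export, and a hypothetical 15-minute price series), and averaging the prices per month is an assumption about how the two series would be aligned.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical monthly search-interest index (0-100 scale), one value per month
months = pd.period_range('2016-04', '2018-03', freq='M')
search_interest = pd.Series(
    rng.integers(20, 100, size=len(months)), index=months, name='search_interest'
)

# Hypothetical 15-minute prices covering the same 24 months
ticks = pd.date_range('2016-04-01', '2018-03-31 23:45', freq='15min')
prices = pd.Series(7 + rng.normal(0, 0.01, len(ticks)).cumsum(), index=ticks, name='price')

# Bring the prices to the same monthly frequency by averaging within each month
monthly_price = prices.groupby(prices.index.to_period('M')).mean()

# Align the two series on month and compute the Pearson correlation
combined = pd.concat([monthly_price, search_interest], axis=1).dropna()
r = combined['price'].corr(combined['search_interest'])
print(f'Pearson r = {r:.3f}')
```

With only 24 monthly points, a coefficient near zero says little either way, which is consistent with the cautious wording in the section above.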